57 research outputs found

    Comparative Genomics Reveals Adaptation by Alteromonas sp. SN2 to Marine Tidal-Flat Conditions: Cold Tolerance and Aromatic Hydrocarbon Metabolism

    Alteromonas species are globally distributed copiotrophic bacteria in marine habitats. Among these, sea-tidal flats are distinctive: undergoing seasonal temperature and oxygen-tension changes, plus periodic exposure to petroleum hydrocarbons. Strain SN2 of the genus Alteromonas was isolated from hydrocarbon-contaminated sea-tidal flat sediment and has been shown to metabolize aromatic hydrocarbons there. Strain SN2's genomic features were analyzed bioinformatically and compared to those of Alteromonas macleodii ecotypes: AltDE and ATCC 27126. Strain SN2's genome differs from that of the other two strains in: size, average nucleotide identity value, tRNA genes, noncoding RNAs, dioxygenase gene content, signal transduction genes, and the degree to which genes collected during the Global Ocean Sampling project are represented. Patterns in genetic characteristics (e.g., GC content, GC skew, Karlin signature, CRISPR gene homology) indicate that strain SN2's genome architecture has been altered via horizontal gene transfer (HGT). Experiments proved that strain SN2 was far more cold tolerant, especially at 5°C, than the other two strains. Consistent with the HGT hypothesis, a total of 15 genomic islands in strain SN2 likely confer ecological fitness traits (especially membrane transport, aromatic hydrocarbon metabolism, and fatty acid biosynthesis) specific to the adaptation of strain SN2 to its seasonally cold sea-tidal flat habitat.

    Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

    The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.

    Analysis of proteolytic processing sites in potyvirus polyproteins revealed differential amino acid preferences of NIa-Pro protease in each of seven cleavage sites.

    Potyviruses encode a large polyprotein that undergoes proteolytic processing, producing 10 mature proteins: P1, HC-Pro, P3, 6K1, CI, 6K2, VPg, NIa-Pro, NIb-RdRp, and CP. While P1/HC-Pro and HC-Pro/P3 junctions are cleaved by P1 and HC-Pro, respectively, the remaining seven are processed by NIa-Pro. In this study, we analyzed 135 polyprotein sequences from approved potyvirus species and deduced the consensus amino acid residues at five positions (from -4 to +1, where a protease cleaves between -1 and +1) in each of nine cleavage sites. In general, the newly deduced consensus sequences were consistent with the previous ones. However, seven NIa-Pro cleavage sites showed distinct amino acid preferences despite being processed by the same protease. At position -2, histidine was the dominant amino acid residue in most cleavage sites (57.8-60.7% of analyzed sequences), except for the NIa-Pro/NIb-RdRp junction where it was absent. At position -1, glutamine was highly dominant in most sites (88.2-97.8%), except for the VPg/NIa-Pro junction where glutamic acid was found in all the analyzed proteins (100%). At position +1, serine was the most abundant residue (47.4-86.7%) in five out of seven sites, while alanine (52.6%) and glycine (82.2%) were the most abundant in the P3/6K1 and 6K2/VPg junctions, respectively. These findings suggest that each NIa-Pro cleavage site is finely tuned for differential characteristics of proteolytic reactions. The newly deduced consensus sequences may be useful resources for the development of models and methods to accurately predict potyvirus polyprotein processing sites.
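
    The per-position consensus described in this abstract can be illustrated with a minimal sketch. It assumes each cleavage site has already been extracted as a five-residue window covering positions -4, -3, -2, -1 and +1; the function name `consensus` and the toy windows are hypothetical and are not data or code from the study.

```python
from collections import Counter


def consensus(site_windows):
    """Return (most common residue, percent of sequences) for each position.

    site_windows: equal-length strings, one per polyprotein, each covering
    positions -4, -3, -2, -1, +1 around a single cleavage site.
    """
    if not site_windows:
        raise ValueError("need at least one window")
    length = len(site_windows[0])
    if any(len(w) != length for w in site_windows):
        raise ValueError("all windows must have the same length")

    result = []
    for i in range(length):
        # Count residues observed at this position across all sequences.
        counts = Counter(w[i] for w in site_windows)
        residue, n = counts.most_common(1)[0]
        result.append((residue, 100.0 * n / len(site_windows)))
    return result


# Made-up windows for illustration only (not taken from the paper).
toy_windows = ["VYHQS", "VFHQS", "VYHQA", "IYHQS"]
for position, (residue, percent) in zip((-4, -3, -2, -1, 1), consensus(toy_windows)):
    print(position, residue, percent)
```

    Running the same function separately on the windows of each cleavage site would give one consensus per site, which is the kind of comparison the abstract reports.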

    Evidence for bacterial origin of heat shock RNA-1

    The heat shock RNA-1 (HSR1) is a noncoding RNA (ncRNA) reported to be involved in mammalian heat shock response. HSR1 was shown to significantly stimulate the heat-shock factor 1 (HSF1) trimerization and DNA binding. The hamster HSR1 sequence was reported to consist of 604 nucleotides (nt) plus a poly(A) tail and to have only a 4-nt difference with the human HSR1. In this study, we present highly convincing evidence for bacterial origin of the HSR1. No HSR1 sequence was found by exhaustive sequence similarity searches of the publicly available eukaryotic nucleotide sequence databases at the NCBI, including the expressed sequence tags, genome survey sequences, and high-throughput genomic sequences divisions of GenBank, as well as the Trace Archive database of whole genome shotgun sequences, and genome assemblies. Instead, a putative open reading frame (ORF) of HSR1 revealed strong similarity to the amino-terminal region of bacterial chloride channel proteins. Furthermore, the 5′ flanking region of the putative HSR1 ORF showed similarity to the 5′ upstream regions of the bacterial protein genes. We propose that the HSR1 was derived from a bacterial genome fragment either by horizontal gene transfer or by bacterial infection of the cells. The most probable source organism of the HSR1 is a species belonging to the order Burkholderiales.

    Accurate quantitation of allele-specific expression patterns by analysis of DNA melting

    Epigenetic and genetic mechanisms can result in large differences in expression levels of the two alleles in a diploid organism. Furthermore, these differences may be critical to phenotypic variations among individuals. In this study, we present a novel procedure and algorithm to precisely and accurately quantitate the relative expression of each allele. This method uses the differential melting properties of DNAs differing at even a single base pair. By referring to the melting characteristics of the two pure alleles, the fractional contribution of the two alleles to any unknown mixture can be mathematically resolved. These methods are highly accurate and precise because each single melting reaction yields multiple data points for analysis. Finally, we discuss how this approach can be used more generally to accurately quantitate gene expression relative to known standards.
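
    One way to picture how a mixture can be resolved from two pure-allele references is a least-squares fit, sketched below. The abstract does not specify the authors' algorithm, so this is an assumption: the mixture's melting signal is modeled as a linear combination f * A + (1 - f) * B of the two pure-allele signals sampled at the same temperatures. The function name `allele_fraction` and the numbers are hypothetical.

```python
def allele_fraction(pure_a, pure_b, mixture):
    """Estimate the fraction f of allele A in a mixture by least squares.

    Assumes (illustrative model, not the published method) that at every
    temperature point: mixture = f * pure_a + (1 - f) * pure_b.
    All three inputs are signals sampled at the same temperatures.
    """
    if not (len(pure_a) == len(pure_b) == len(mixture)):
        raise ValueError("all curves must have the same number of points")

    # Rearranged model: (mixture - pure_b) = f * (pure_a - pure_b).
    diff = [a - b for a, b in zip(pure_a, pure_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("pure-allele curves are identical; f is undefined")

    num = sum((m - b) * d for m, b, d in zip(mixture, pure_b, diff))
    return num / denom


# Made-up melting signals for illustration only.
curve_a = [1.0, 0.8, 0.3, 0.1]
curve_b = [1.0, 0.9, 0.6, 0.2]
curve_mix = [0.25 * a + 0.75 * b for a, b in zip(curve_a, curve_b)]
print(allele_fraction(curve_a, curve_b, curve_mix))
```

    Because every temperature point contributes to the fit, a single melting reaction supplies many data points for the estimate, which matches the abstract's stated reason for the method's precision.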

    Metagenomic analysis of the human microbiome reveals the association between the abundance of gut bile salt hydrolases and host health

    Bile acid metabolism by the gut microbiome exerts both beneficial and harmful effects on host health. Microbial bile salt hydrolases (BSHs), which initiate bile acid metabolism, exhibit both positive and negative effects on host physiology. In this study, 5,790 BSH homologs were collected and classified into seven clusters based on a sequence similarity network. Next, the abundance and distribution of BSH in 380 metagenomes from healthy participants were analyzed. It was observed that different clusters occupied diverse ecological niches in the human microbiome and that the clusters with signal peptides were relatively abundant in the gut. Then, the association between BSH clusters and 12 human diseases was analyzed by comparing the abundances of BSH genes in patients (n = 1,605) and healthy controls (n = 1,540). The analysis identified a significant association between BSH gene abundance and 10 human diseases, including gastrointestinal diseases, obesity, type 2 diabetes, liver diseases, cardiovascular diseases, and neurological diseases. The associations were further validated by separate cohorts with inflammatory bowel diseases and colorectal cancer. These large-scale studies of enzyme sequences combined with metagenomic data provide a reproducible assessment of the association between gut BSHs and human diseases. This information can contribute to future diagnostic and therapeutic applications of BSH-active bacteria for improving human health.

    Comparative validation of methylation controls from commercial sources with the synthetic reference materials.

    Methylation levels of commercial DNA methylation controls and synthetic reference materials were measured by melting- (A) and NGS-based analyses (B).